Abstract

In this paper we introduce and study a family of complexity functions of infinite words indexed
by $k \in \mathbb{Z}^+ \cup \{+\infty\}$. Let $k \in \mathbb{Z}^+ \cup \{+\infty\}$ and $A$ be a finite non-empty set. Two finite words $u$
and $v$ in $A^*$ are said to be $k$-Abelian equivalent if for all $x \in A^*$ of length less than or equal
to $k$, the number of occurrences of $x$ in $u$ is equal to the number of occurrences of $x$ in $v$.
This defines a family of equivalence relations $\sim_k$ on $A^*$, bridging the gap between the usual
notion of Abelian equivalence (when $k = 1$) and equality (when $k = +\infty$). We show that the
number of $k$-Abelian equivalence classes of words of length $n$ grows polynomially, although
the degree is exponential in $k$. Given an infinite word $\omega \in A^{\mathbb{N}}$, we consider the associated
complexity function $\mathcal{P}^{(k)}_\omega : \mathbb{N} \to \mathbb{N}$ which counts the number of $k$-Abelian equivalence classes
of factors of $\omega$ of length $n$. We show that the complexity function $\mathcal{P}^{(k)}_\omega$ is intimately linked
with periodicity. More precisely we define an auxiliary function $q^{(k)} : \mathbb{N} \to \mathbb{N}$ and show that if
$\mathcal{P}^{(k)}_\omega(n) < q^{(k)}(n)$ for some $k \in \mathbb{Z}^+ \cup \{+\infty\}$ and $n \geq 0$, then $\omega$ is ultimately periodic. Moreover
if $\omega$ is aperiodic, then $\mathcal{P}^{(k)}_\omega(n) = q^{(k)}(n)$ if and only if $\omega$ is Sturmian. We also study $k$-Abelian
complexity in connection with repetitions in words. Using Szemerédi's theorem, we show that
if $\omega$ has bounded $k$-Abelian complexity, then for every $D \subseteq \mathbb{N}$ with positive upper density and
for every positive integer $N$, there exists a $k$-Abelian $N$ power occurring in $\omega$ at some position
$j \in D$.

1. Introduction

Abelian equivalence of words has long been a subject of great interest (see for instance
Erdős's problem and the related literature). Given a finite non-empty set $A$, let $A^*$ denote
the set of all finite words over $A$. Two words $u$ and $v$ in $A^*$ are Abelian equivalent, denoted
$u \sim_{ab} v$, if and only if $|u|_a = |v|_a$ for all $a \in A$, where $|u|_a$ and $|v|_a$ denote the number of
occurrences of $a$ in $u$ and $v$, respectively. It is readily verified that $\sim_{ab}$ defines an equivalence
relation (in fact a congruence) on $A^*$.

We consider the following natural generalization: Fix $k \in \mathbb{Z}^+ \cup \{+\infty\}$. Two words $u$ and
$v$ in $A^*$ are said to be $k$-Abelian equivalent, written $u \sim_k v$, if $|u|_x = |v|_x$ for each non-empty
word $x$ with $|x| \leq k$ (where $|x|$ denotes the length of $x$, and $|u|_x$ and $|v|_x$ denote the number
of occurrences of $x$ in $u$ and $v$, respectively). We note that $u \sim_{+\infty} v$ if and only if $u = v$, while
$\sim_1$ corresponds to the usual notion of Abelian equivalence $\sim_{ab}$. Thus one may regard the
notion of $k$-Abelian equivalence as gradually bridging the gap between Abelian equivalence
($k = 1$) and equality ($k = +\infty$). It is readily verified that $\sim_k$ defines an equivalence relation
(in fact a congruence) on $A^*$. Clearly, if $u \sim_k v$, then $|u| = |v|$ and $u \sim_\ell v$ for each positive
integer $\ell \leq k$.



The notion of $k$-Abelian equivalence was first introduced by the first author in [16] in
connection with formal languages and decidability questions of various fundamental problems.
It was shown that the well known Parikh Theorem on the equivalence of Parikh images of
regular and context-free languages does not hold for $k$-Abelian equivalence. In contrast, various
highly nontrivial decidability questions, including the D0L sequence equivalence problem
and the Post Correspondence Problem, turned out to be easily decidable in the context
of $k$-Abelian equivalence. Recently $k$-Abelian equivalence has been studied in the context
of avoidance of repetitions in words (see the discussion at the beginning of §5 on $k$-Abelian
powers). In this paper we undertake an investigation of the complexity of infinite words in the
framework of $k$-Abelian equivalence. As is the case with various other notions of complexity
of words, we will see that $k$-Abelian complexity is intimately linked with periodicity and can
be used to detect the presence of repetitions.

Let $A$ be a finite non-empty set. For each infinite word $\omega = a_0 a_1 a_2 \cdots$ with $a_i \in A$, we
denote by $\mathcal{F}_\omega(n)$ the set of all factors of $\omega$ of length $n$, that is, the set of all finite words of
the form $a_i a_{i+1} \cdots a_{i+n-1}$ with $i \geq 0$. We set

$$p_\omega(n) = \mathrm{Card}(\mathcal{F}_\omega(n)).$$

The function $p_\omega : \mathbb{N} \to \mathbb{N}$ is called the factor complexity function of $\omega$. Analogously, for each
$k \in \mathbb{Z}^+ \cup \{+\infty\}$ we define

$$\mathcal{P}^{(k)}_\omega(n) = \mathrm{Card}(\mathcal{F}_\omega(n)/\sim_k).$$

The function $\mathcal{P}^{(k)}_\omega : \mathbb{N} \to \mathbb{N}$, which counts the number of $k$-Abelian equivalence classes of
factors of $\omega$ of length $n$, is called the $k$-Abelian complexity of $\omega$. In case $k = +\infty$ we have that
$\mathcal{P}^{(+\infty)}_\omega(n) = p_\omega(n)$, while if $k = 1$, $\mathcal{P}^{(1)}_\omega(n)$, denoted $\rho^{ab}_\omega(n)$, corresponds to the usual Abelian
complexity of $\omega$.

Most word complexity functions, including factor complexity [23], maximal pattern
complexity, permutation complexity, Abelian complexity, and Abelian maximal
pattern complexity, may be used to detect (and in some cases characterize) ultimately
periodic words. For instance, a celebrated result due to Morse and Hedlund [23] states that
an infinite word $\omega \in A^{\mathbb{N}}$ is ultimately periodic if and only if $p_\omega(n) \leq n$ for some $n \in \mathbb{Z}^+$.
The third author together with T. Kamae proved a similar result in the context of maximal
pattern complexity with $n$ replaced by $2n - 1$. Furthermore, amongst all aperiodic
(meaning non-ultimately periodic) words, Sturmian words generally have the lowest possible
complexity³. We show that these same results hold in the framework of $k$-Abelian
complexity. In order to formulate the precise link between aperiodicity and $k$-Abelian complexity, we
define, for each $k \in \mathbb{Z}^+ \cup \{+\infty\}$, an auxiliary function $q^{(k)} : \mathbb{N} \to \mathbb{N}$ by

$$q^{(k)}(n) = \begin{cases} n + 1 & \text{for } n \leq 2k - 1, \\ 2k & \text{for } n \geq 2k. \end{cases}$$

We prove that for $\omega \in A^{\mathbb{N}}$, if $\mathcal{P}^{(k)}_\omega(n_0) < q^{(k)}(n_0)$ for some $k \in \mathbb{Z}^+ \cup \{+\infty\}$ and $n_0 \geq 1$, then
$\omega$ is ultimately periodic.

This result is already well known in the special cases $k = +\infty$ (see [23]) and $k = 1$.
By the Morse-Hedlund result mentioned earlier, this condition gives
a characterization of ultimately periodic words in the special case $k = +\infty$. In contrast, for
finite $k$, $k$-Abelian complexity does not yield such a characterization. Indeed, both Sturmian words and
the ultimately periodic word $01^\infty = 0111\cdots$ have the same constant Abelian complexity, equal to 2.
More generally, we shall see that the ultimately periodic word $0^{2k-1}1^\infty$ has the same
$k$-Abelian complexity as a Sturmian word. Nevertheless $k$-Abelian complexity gives a complete
characterization of Sturmian words amongst all aperiodic words. More precisely, we prove
that for an aperiodic word $\omega \in A^{\mathbb{N}}$, the following conditions are equivalent:

• $\omega$ is a balanced binary word, that is, Sturmian.

• $\mathcal{P}^{(k)}_\omega(n) = q^{(k)}(n)$ for each $k \in \mathbb{Z}^+ \cup \{+\infty\}$ and $n \geq 1$.

Again, the special cases $k = +\infty$ (see [23]) and $k = 1$ were already known.
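The statements above can be explored numerically on finite prefixes. The following Python sketch is only an illustration under stated assumptions: the helper names (`factor_counts`, `k_abelian_complexity`, `q`, `fibonacci_prefix`, `ult`) are ours, the Fibonacci word stands in for a generic Sturmian word, and a long finite prefix is assumed to contain every short factor of the corresponding infinite word.

```python
from collections import Counter

def factor_counts(w, k):
    """Occurrence counts of all non-empty factors of w of length <= k."""
    return Counter(w[i:i + L] for L in range(1, k + 1)
                   for i in range(len(w) - L + 1))

def k_abelian_complexity(prefix, k, n):
    """Number of k-Abelian classes among the length-n factors of `prefix`."""
    factors = {prefix[i:i + n] for i in range(len(prefix) - n + 1)}
    # A class is identified by the counts of factors of length <= min(k, n).
    return len({frozenset(factor_counts(f, min(k, n)).items()) for f in factors})

def q(k, n):
    """Auxiliary function q^(k)(n): n + 1 for n <= 2k - 1, and 2k for n >= 2k."""
    return n + 1 if n < 2 * k else 2 * k

def fibonacci_prefix(iterations=15):
    """Prefix of the Fibonacci word, the fixed point of 0 -> 01, 1 -> 0."""
    w = "0"
    for _ in range(iterations):
        w = "".join("01" if c == "0" else "0" for c in w)
    return w

def ult(k, tail=200):
    """Finite prefix of the ultimately periodic word 0^(2k-1) 1^infinity."""
    return "0" * (2 * k - 1) + "1" * tail

fib = fibonacci_prefix()   # prefix of a Sturmian (aperiodic) word
per = "01" * 200           # prefix of the periodic word (01)^infinity

for k in (1, 2, 3):
    for n in range(1, 11):
        # Sturmian words and 0^(2k-1) 1^infinity both realize q^(k).
        assert k_abelian_complexity(fib, k, n) == q(k, n)
        assert k_abelian_complexity(ult(k), k, n) == q(k, n)
    # The periodic word drops strictly below q^(k) at some length.
    assert any(k_abelian_complexity(per, k, n) < q(k, n) for n in range(1, 11))
```

The check on `ult(k)` illustrates why falling below $q^{(k)}$ is sufficient but not necessary for ultimate periodicity when $k$ is finite.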

Finally we investigate the question of avoidance of $k$-Abelian $N$ powers: By a $k$-Abelian $N$
power we mean a word $U$ of the form $U = U_1 U_2 \cdots U_N$ such that $U_i \sim_k U_j$ for all $1 \leq i, j \leq N$.
Using Szemerédi's theorem, we show that if $\omega$ has bounded $k$-Abelian complexity, then
for every $D \subseteq \mathbb{N}$ with positive upper density and for every positive integer $N$, there exists a
$k$-Abelian $N$ power occurring in $\omega$ at some position $j \in D$.

The paper is organized as follows: In §2 we recall some basic definitions and notation and
establish various basic properties of $k$-Abelian equivalence of words. Also in §2 we compute
the rate of growth of the number of $k$-Abelian equivalence classes of words in $A^n$. In §3 we
develop the link between $k$-Abelian complexity and periodicity of words. In §4 we compute the
$k$-Abelian complexity of Sturmian words and show that it completely characterizes Sturmian
words amongst all aperiodic words. Finally in §5 we study $k$-Abelian complexity in the context
of repetitions in words.

2. $k$-Abelian equivalence

2.1. Definitions and first properties 

Given a finite non-empty set $A$, we denote by $A^*$ the set of all finite words over $A$ including
the empty word, denoted by $\varepsilon$, by $A^+$ the set of all finite non-empty words over $A$, by $A^{\mathbb{N}}$
the set of (right) infinite words over $A$, and by $A^{\mathbb{Z}}$ the set of bi-infinite words over $A$. Given
a finite word $u = a_1 a_2 \ldots a_n$ with $n \geq 1$ and $a_i \in A$, we denote the length $n$ of $u$ by $|u|$ (by
convention we set $|\varepsilon| = 0$). For each $x \in A^+$, we let $|u|_x$ denote the number of occurrences of
$x$ in $u$. For $u \in A^*$, we denote by $\tilde{u}$ the reverse of $u$.

³ With respect to maximal pattern complexity, and Abelian maximal pattern complexity, Sturmian words
are not the only words of lowest complexity.

A factor $u$ of $\omega = a_0 a_1 a_2 \cdots \in A^{\mathbb{N}}$ is called right special (respectively left special) if there
exist distinct symbols $a, b \in A$ such that both $ua$ and $ub$ (respectively $au$ and $bu$) are factors
of $\omega$. We say $u$ is bispecial if $u$ is both left and right special. An infinite word $\omega \in A^{\mathbb{N}}$ is
said to be periodic if there exists a positive integer $p$ such that $a_{i+p} = a_i$ for all indices $i$.
It is said to be ultimately periodic if $a_{i+p} = a_i$ for all sufficiently large $i$. It is said to be
aperiodic if it is not ultimately periodic. Sturmian words are the simplest aperiodic infinite
words; Sturmian words are infinite words over a binary alphabet having exactly $n + 1$ factors
of length $n$ for each $n \geq 0$. Their origin can be traced back to the astronomer J. Bernoulli
III in 1772. A fundamental result due to Morse and Hedlund [23] states that each aperiodic
(meaning non-ultimately periodic) infinite word must contain at least $n + 1$ factors of each
length $n \geq 0$. Thus Sturmian words are those aperiodic words of lowest factor complexity.
They arise naturally in many different areas of mathematics including combinatorics, algebra,
number theory, ergodic theory, dynamical systems and differential equations. Sturmian words
are also of great importance in theoretical physics and in theoretical computer science and
are used in computer graphics as digital approximations of straight lines. If $\omega \in \{a, b\}^{\mathbb{N}}$ is
Sturmian, then for each positive integer $n$ there exists a unique right special (respectively left
special) factor of length $n$, and one is the reversal of the other. In particular, if $x$ is a bispecial
factor, then $x$ is a palindrome, i.e., $x = \tilde{x}$. For more on Sturmian words, we refer the reader to
the references.

Definition 2.1. Let $k \in \mathbb{Z}^+ \cup \{+\infty\}$. We say two words $u, v \in A^+$ are $k$-Abelian equivalent
and write $u \sim_k v$, if $|u|_x = |v|_x$ for all words $x$ of length $|x| \leq k$.

We note that if $u, v \in A^+$ and $|u| = |v| \leq k$, then $u \sim_k v$ if and only if $u = v$.

Example 2.2. The words $u = 010110$ and $v = 011010$ are 3-Abelian equivalent but not
4-Abelian equivalent since the prefix $0101$ of $u$ does not occur in $v$. The words $u = 0110$ and
$v = 1101$ are not 2-Abelian equivalent (since they are not Abelian equivalent) yet for every
word $x$ of length 2 we have $|u|_x = |v|_x$.
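Definition 2.1 and Example 2.2 can be checked mechanically. The sketch below is a direct, unoptimized transcription of the definition; the function names `occurrences` and `k_abelian_equivalent` are ours and are used only for illustration.

```python
from collections import Counter

def occurrences(w, x):
    """Number of (possibly overlapping) occurrences of x in w, i.e. |w|_x."""
    return sum(1 for i in range(len(w) - len(x) + 1) if w[i:i + len(x)] == x)

def k_abelian_equivalent(u, v, k):
    """Definition 2.1: |u|_x == |v|_x for every non-empty word x with |x| <= k."""
    def counts(w):
        return Counter(w[i:i + L] for L in range(1, k + 1)
                       for i in range(len(w) - L + 1))
    return counts(u) == counts(v)

print(k_abelian_equivalent("010110", "011010", 3))  # True
print(k_abelian_equivalent("010110", "011010", 4))  # False
print(occurrences("011010", "0101"))                # 0

# Same counts for every word of length 2, yet not even Abelian equivalent:
same_len2 = all(occurrences("0110", x) == occurrences("1101", x)
                for x in ("00", "01", "10", "11"))
print(same_len2, k_abelian_equivalent("0110", "1101", 1))  # True False
```

The last line shows why the definition quantifies over all lengths up to $k$ rather than over length exactly $k$.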

The next lemma gives different equivalent ways of defining $k$-Abelian equivalence. For
example, item (1) corresponds to Definition 2.1 and item (3) corresponds to another
common definition: Words $u$ and $v$ of length at least $k - 1$ are $k$-Abelian equivalent if they
share the same prefixes and suffixes of length $k - 1$ and if $|u|_t = |v|_t$ for every word $t$ of length
$k$.

Lemma 2.3. Let $u$ and $v$ be words of length at least $k - 1$ and let $|u|_t = |v|_t$ for every word $t$
of length $k$. The following are equivalent:

1. $|u|_s = |v|_s$ for all $s \in A^{\leq k-1}$,

2. $|u|_s = |v|_s$ for all $s \in A^{k-1}$,

3. $\mathrm{pref}_{k-1}(u) = \mathrm{pref}_{k-1}(v)$ and $\mathrm{suff}_{k-1}(u) = \mathrm{suff}_{k-1}(v)$,

4. $\mathrm{pref}_{k-1}(u) = \mathrm{pref}_{k-1}(v)$,

5. $\mathrm{suff}_{k-1}(u) = \mathrm{suff}_{k-1}(v)$,

6. $\mathrm{pref}_i(u) = \mathrm{pref}_i(v)$ and $\mathrm{suff}_{k-1-i}(u) = \mathrm{suff}_{k-1-i}(v)$ for some $i \in \{0, \ldots, k - 1\}$.

Proof. (1) $\Rightarrow$ (2): Clear.

(2) $\Rightarrow$ (3): Let $\{t_1, \ldots, t_n\}$ be the multiset of factors of $u$ (and of $v$) of length $k$. The
multiset of factors of $u$ of length $k - 1$ is

$$\{\mathrm{pref}_{k-1}(u)\} \cup \{\mathrm{suff}_{k-1}(t_1), \ldots, \mathrm{suff}_{k-1}(t_n)\},$$

and the multiset of factors of $v$ of length $k - 1$ is

$$\{\mathrm{pref}_{k-1}(v)\} \cup \{\mathrm{suff}_{k-1}(t_1), \ldots, \mathrm{suff}_{k-1}(t_n)\}.$$

These multisets must be the same, so $\mathrm{pref}_{k-1}(u) = \mathrm{pref}_{k-1}(v)$. Similarly, $\mathrm{suff}_{k-1}(u) =
\mathrm{suff}_{k-1}(v)$.

(3) $\Rightarrow$ (4), (5): Clear.

(4) or (5) $\Rightarrow$ (6): Clear.

(6) $\Rightarrow$ (2): Let $\{t_1, \ldots, t_n\}$ be the multiset of factors of $u$ (and of $v$) of length $k$. Every

$$s \in A^{k-1} \setminus \{\mathrm{pref}_{k-1}(u), \mathrm{suff}_{k-1}(u)\}$$

appears in the multiset

$$\{\mathrm{pref}_{k-1}(t_1), \ldots, \mathrm{pref}_{k-1}(t_n)\} \cup \{\mathrm{suff}_{k-1}(t_1), \ldots, \mathrm{suff}_{k-1}(t_n)\} \qquad (1)$$

$2|u|_s$ times. A word $s \in \{\mathrm{pref}_{k-1}(u), \mathrm{suff}_{k-1}(u)\}$ appears $2|u|_s - 1$ times if $\mathrm{pref}_{k-1}(u) \neq
\mathrm{suff}_{k-1}(u)$, and $2|u|_s - 2$ times if $\mathrm{pref}_{k-1}(u) = \mathrm{suff}_{k-1}(u)$. Similarly, every

$$s \in A^{k-1} \setminus \{\mathrm{pref}_{k-1}(v), \mathrm{suff}_{k-1}(v)\}$$

appears $2|v|_s$ times, and a word $s \in \{\mathrm{pref}_{k-1}(v), \mathrm{suff}_{k-1}(v)\}$ appears $2|v|_s - 1$ times if
$\mathrm{pref}_{k-1}(v) \neq \mathrm{suff}_{k-1}(v)$, and $2|v|_s - 2$ times if $\mathrm{pref}_{k-1}(v) = \mathrm{suff}_{k-1}(v)$.

If some words appear an odd number of times in the multiset (1), then these must be $\mathrm{pref}_{k-1}(u)$ and
$\mathrm{suff}_{k-1}(u)$, and they must also be $\mathrm{pref}_{k-1}(v)$ and $\mathrm{suff}_{k-1}(v)$. It follows that $|u|_s = |v|_s$ for
every $s \in A^{k-1}$. (In this case the assumption of item (6) was not needed.)

If all words appear an even number of times in the multiset (1), then necessarily $\mathrm{pref}_{k-1}(u) = \mathrm{suff}_{k-1}(u)$
and $\mathrm{pref}_{k-1}(v) = \mathrm{suff}_{k-1}(v)$. From item (6) it follows that $\mathrm{pref}_{k-1}(u) = \mathrm{pref}_{k-1}(v)$ and $\mathrm{suff}_{k-1}(u) =
\mathrm{suff}_{k-1}(v)$, and thus $|u|_s = |v|_s$ for every $s \in A^{k-1}$.

(2) $\Rightarrow$ (1): The fact that $|u|_s = |v|_s$ also for every $s$ of length less than $k - 1$ can be proved in a similar
way. □

The next lemma lists some basic facts on $k$-Abelian equivalence:

Lemma 2.4. Let $u, v \in A^*$ and $k \geq 1$.

• If $|u| = |v| \leq 2k - 1$ and $u \sim_k v$, then $u = v$.

• If $u \sim_k v$, then $u \sim_{k'} v$ for all $k' \leq k$.

• If $u_1 \sim_k v_1$ and $u_2 \sim_k v_2$, then $u_1 u_2 \sim_k v_1 v_2$.

The bound $2k - 1$ in Lemma 2.4 is optimal as for each positive integer $k$ there exist words
$u \neq v$ of length $2k$ such that $u \sim_k v$. For example, the words $u = 0^{k-1} 01\, 0^{k-1}$ and $v =
0^{k-1} 10\, 0^{k-1}$ of length $2k$ are readily verified to be $k$-Abelian equivalent (see Proposition 2.8).
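Both the first item of Lemma 2.4 and the optimality of the bound $2k - 1$ can be checked exhaustively for small parameters. The sketch below does so over the binary alphabet; the helper `signature` (our name) maps a word to the data that determines its $k$-Abelian class.

```python
from collections import Counter
from itertools import product

def signature(w, k):
    """Counts of all factors of length <= k; equal signatures <=> k-Abelian equivalent."""
    return frozenset(Counter(w[i:i + L] for L in range(1, k + 1)
                             for i in range(len(w) - L + 1)).items())

# Optimality: u = 0^(k-1) 01 0^(k-1) and v = 0^(k-1) 10 0^(k-1) have length 2k,
# are distinct, and are k-Abelian equivalent.
for k in range(1, 6):
    u = "0" * (k - 1) + "01" + "0" * (k - 1)
    v = "0" * (k - 1) + "10" + "0" * (k - 1)
    assert len(u) == len(v) == 2 * k and u != v
    assert signature(u, k) == signature(v, k)

# First item of Lemma 2.4 over {0, 1}: for lengths n <= 2k - 1,
# distinct words always lie in distinct k-Abelian classes.
for k in (1, 2, 3):
    for n in range(1, 2 * k):
        words = ["".join(p) for p in product("01", repeat=n)]
        assert len({signature(w, k) for w in words}) == len(words)
```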

Lemma 2.5. Fix $2 \leq k < +\infty$. Suppose $aub \sim_k cvd$ with $a, b, c, d \in A$ and $u, v \in A^*$. Then
$u \sim_{k-1} v$.

Proof. Let $x \in A^*$ with $|x| \leq k - 1$. We can assume that $|x| < |aub|$ for otherwise $0 = |u|_x =
|v|_x$. If $x$ is neither a prefix nor a suffix of $aub$, then by Lemma 2.3 $x$ is neither a prefix nor a
suffix of $cvd$ and hence $|u|_x = |aub|_x = |cvd|_x = |v|_x$. If $x$ is either a prefix of $aub$ or a suffix of
$aub$ but not both, then $|u|_x = |aub|_x - 1 = |cvd|_x - 1 = |v|_x$. Finally if $x$ is both a prefix and a
suffix of $aub$ then $|u|_x = |aub|_x - 2 = |cvd|_x - 2 = |v|_x$. □

2.2. A first connection to Sturmian words 

The next theorem gives a complete classification of pairs of $k$-Abelian equivalent words of
length $2k$ and establishes a first link to Sturmian words:

Theorem 2.6. Fix a positive integer $k$, and let $u, v \in A^*$ be distinct words of length $2k$. Then
$u \sim_k v$ if and only if there exist distinct letters $a, b \in A$, a Sturmian word $\omega \in \{a, b\}^{\mathbb{N}}$ and a
right special factor $x$ of $\omega$ of length $k - 1$ (or empty in case $k = 1$) such that

$$u = x\,ab\,\tilde{x} \quad \text{and} \quad v = x\,ba\,\tilde{x}.$$

In particular $u$ and $v$ are both factors of the same Sturmian word $\omega$.
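The shape asserted by Theorem 2.6 can be observed by brute force for small $k$. The sketch below enumerates all pairs of distinct $k$-Abelian equivalent binary words of length $2k$ and checks that each pair has the form $x\,ab\,\tilde{x}$, $x\,ba\,\tilde{x}$ with $|x| = k - 1$. It only tests this shape; it does not test that $x$ is right special in a Sturmian word. The names `signature` and `has_theorem_shape` are ours.

```python
from collections import Counter
from itertools import combinations, product

def signature(w, k):
    """Counts of all factors of length <= k; equal signatures <=> k-Abelian equivalent."""
    return frozenset(Counter(w[i:i + L] for L in range(1, k + 1)
                             for i in range(len(w) - L + 1)).items())

def has_theorem_shape(u, v, k):
    """True if {u, v} = {x ab x~, x ba x~} with |x| = k - 1 and a != b."""
    x, mid_u, mid_v = u[:k - 1], u[k - 1:k + 1], v[k - 1:k + 1]
    return (v[:k - 1] == x
            and u[k + 1:] == x[::-1] and v[k + 1:] == x[::-1]
            and mid_u[0] != mid_u[1] and mid_v == mid_u[::-1])

pairs = {}
for k in (1, 2, 3):
    words = ["".join(p) for p in product("01", repeat=2 * k)]
    pairs[k] = [(u, v) for u, v in combinations(words, 2)
                if signature(u, k) == signature(v, k)]
    assert all(has_theorem_shape(u, v, k) for u, v in pairs[k])

print(pairs[2])  # [('0010', '0100'), ('1011', '1101')]
```

For $k = 2$ the two pairs correspond to $x = 0$ and $x = 1$, each of which is a right special factor of some Sturmian word.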

Remark 2.7. It follows that if $u$ and $v$ are distinct $k$-Abelian equivalent words of length $2k$,
then both $u$ and $v$ are on a binary alphabet and in fact factors of the same Sturmian word $\omega$.
In fact, if $B$ is a bispecial factor of $\omega$ then both $BabB$ and $BbaB$ are factors of $\omega$. Also, if $x$ is
a right special factor of $\omega$, then there exists a bispecial factor $B$ of $\omega$ with $x$ a suffix of $B$ and
$\tilde{x}$ a prefix of $B$. Thus both $x\,ab\,\tilde{x}$ and $x\,ba\,\tilde{x}$ are factors of $\omega$.

We will need the next result applied to Sturmian words, but we prove it more generally
for episturmian words. We refer the reader to [7] for the definition and basic properties of
episturmian words.

Proposition 2.8. Fix a positive integer $k \geq 2$. Let $u$ and $v$ be factors of the same episturmian
word $\omega$. Then $u$ and $v$ are $k$-Abelian equivalent if and only if $u$ and $v$ are $(k - 1)$-Abelian
equivalent and share a common prefix and a common suffix of length $\min\{|u|, k - 1\}$. Thus,
$u$ and $v$ are $k$-Abelian equivalent if and only if $u$ and $v$ are Abelian equivalent and share a
common prefix and a common suffix of length $\min\{|u|, k - 1\}$.

Proof. One direction follows immediately from Lemma 2.3. Next suppose that $u$ and $v$ are
$(k - 1)$-Abelian equivalent factors of the same episturmian word $\omega$, and that $u$ and $v$ share a
common prefix and a common suffix of length $\min\{|u|, k - 1\}$. To prove that $u \sim_k v$ it suffices
to show that whenever $axb \in \mathcal{F}_\omega(k)$ (with $a, b \in A$ and $x \in A^*$), we have $|u|_{axb} = |v|_{axb}$. First
let us suppose that $ax$ is not a right special factor of $\omega$ so that every occurrence in $\omega$ of $ax$ is
an occurrence of $axb$. Then, if $ax$ is not a suffix of $u$ (and hence not a suffix of $v$) we obtain

$$|u|_{axb} = |u|_{ax} = |v|_{ax} = |v|_{axb}.$$

On the other hand if $ax$ is a suffix of $u$ (and hence also a suffix of $v$) we have

$$|u|_{axb} = |u|_{ax} - 1 = |v|_{ax} - 1 = |v|_{axb}.$$

Similarly, in case $xb$ is not a left special factor of $\omega$ we obtain $|u|_{axb} = |v|_{axb}$. Thus it remains
to consider the case when $ax$ is right special in $\omega$ and $xb$ is left special in $\omega$. In this case $x$
is bispecial and $a = b$. For each $c \in A$, let $n_c = |u|_{axc}$ and $n'_c = |v|_{axc}$. We must show that
$n_a = n'_a$. However we know that $n_c = n'_c$ for all $c \neq a$ since $xc$ is not left special in $\omega$. Now, if
$ax$ is not a suffix of $u$ (and hence not a suffix of $v$) we have

$$\sum_{c \in A} n_c = |u|_{ax} = |v|_{ax} = \sum_{c \in A} n'_c$$

whence $n_a = n'_a$. On the other hand if $ax$ is a suffix of $u$ (and hence a suffix of $v$) then

$$\sum_{c \in A} n_c = |u|_{ax} - 1 = |v|_{ax} - 1 = \sum_{c \in A} n'_c$$

whence $n_a = n'_a$ as required. □

Remark 2.9. The following example illustrates that the assumption in Proposition 2.8 that
$u$ and $v$ are factors of the same Sturmian word is necessary: Let $u = aabb$ and $v = abab$. Then
$u$ and $v$ are Abelian equivalent and share a common prefix and suffix of length 1, yet they are
not 2-Abelian equivalent.
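Proposition 2.8 and Remark 2.9 lend themselves to a quick experiment. The sketch below compares $k$-Abelian equivalence with the prefix, suffix and Abelian condition on the factors of a long prefix of the Fibonacci word (used here as a sample Sturmian, hence episturmian, word; this choice and the helper names are ours), and then reproduces the counterexample of Remark 2.9.

```python
from collections import Counter
from itertools import combinations

def signature(w, k):
    """Counts of all factors of length <= k; equal signatures <=> k-Abelian equivalent."""
    return frozenset(Counter(w[i:i + L] for L in range(1, k + 1)
                             for i in range(len(w) - L + 1)).items())

def prop_2_8_condition(u, v, k):
    """Abelian equivalent, with a common prefix and suffix of length min(|u|, k - 1)."""
    m = min(len(u), k - 1)
    return (len(u) == len(v) and Counter(u) == Counter(v)
            and u[:m] == v[:m] and u[len(u) - m:] == v[len(v) - m:])

# Prefix of the Fibonacci word (fixed point of 0 -> 01, 1 -> 0).
fib = "0"
for _ in range(15):
    fib = "".join("01" if c == "0" else "0" for c in fib)

for k in (2, 3, 4):
    for n in range(1, 13):
        factors = sorted({fib[i:i + n] for i in range(len(fib) - n + 1)})
        for u, v in combinations(factors, 2):
            assert (signature(u, k) == signature(v, k)) == prop_2_8_condition(u, v, k)

# Remark 2.9: outside a common Sturmian word the condition is not sufficient.
print(prop_2_8_condition("aabb", "abab", 2))          # True
print(signature("aabb", 2) == signature("abab", 2))   # False
```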

Proof of Theorem 2.6. We start by showing that if $\omega \in \{a, b\}^{\mathbb{N}}$ is a Sturmian word, and $x$ a
right special factor of $\omega$ of length $k - 1$, then $u = x\,ab\,\tilde{x}$ and $v = x\,ba\,\tilde{x}$ are $k$-Abelian equivalent.
This follows from Proposition 2.8 since $u$ and $v$ share a common prefix and a common suffix
of length $k - 1$ and are Abelian equivalent.

Next we suppose that $u$ and $v$ are distinct $k$-Abelian equivalent words of length $2k$ and
show that both $u$ and $v$ have the required form. We proceed by induction on $k$. In case $k = 1$,
we have that $u$ and $v$ are distinct Abelian equivalent words of length 2 whence $u$ and $v$ may
be written in the form $u = ab$ and $v = ba$ for some $a \neq b$ in $A$.

Next suppose the result of Theorem 2.6 is true for $k - 1$ and we shall prove it for $k$.
So let $u$ and $v$ be distinct $k$-Abelian equivalent words of length $2k$ with $k > 1$. Then by
Lemma 2.3 we can write $u = a'u'b'$ and $v = a'v'b'$ for some $a', b' \in A$ and $u', v' \in A^*$ where
$|u'| = |v'| = 2(k - 1) \geq 2$. Since $u$ and $v$ are distinct, it follows that $u' \neq v'$. Also, by Lemma 2.5
it follows that $u' \sim_{k-1} v'$. Thus by induction hypothesis, there exist distinct letters $a, b \in A$
and a Sturmian word $\omega \in \{a, b\}^{\mathbb{N}}$ such that $u'$ and $v'$ are both factors of $\omega$ of the form $u' = x\,ab\,\tilde{x}$
and $v' = x\,ba\,\tilde{x}$ for some right special factor $x$ of $\omega$ of length $k - 2$.

Thus we can write $u = a'x\,ab\,\tilde{x}b'$ and $v = a'x\,ba\,\tilde{x}b'$. Since $u \sim_k v$, $|a'xa| = k$, and $a \neq b$ it
follows that $a'x$ must occur in $v'$ and hence $a' \in \{a, b\}$. Similarly we deduce that $b' \in \{a, b\}$.

Let us first suppose that $x \neq \tilde{x}$. Then $a'xa$ must occur in $v'$ and $a\tilde{x}b'$ must occur in $u'$.
Hence both $a'xa$ and $a\tilde{x}b'$ are factors of $\omega$. Moreover, since $x \neq \tilde{x}$ it follows that $x$ is not left
special in $\omega$ and $\tilde{x}$ is not right special in $\omega$. Hence every occurrence of $x$ in $\omega$ is preceded by
$a'$ and every occurrence of $\tilde{x}$ in $\omega$ is followed by $b'$. Since the factors of $\omega$ are closed under
reversal, we deduce that $a' = b'$ and $a'x$ is a right special factor of $\omega$. Moreover, since $u'$ and
$v'$ are both factors of $\omega$ beginning in $x$ and ending in $\tilde{x}$, it follows that $u = a'x\,ab\,\tilde{x}a'$ and
$v = a'x\,ba\,\tilde{x}a'$ are both factors of $\omega$.

Finally suppose $x = \tilde{x}$ so that $x$ is a bispecial factor of $\omega$. We may write the increasing
sequence of bispecial factors $\varepsilon = B_0, B_1, \ldots, x = B_n, B_{n+1}, \ldots$ so that $x$ is the $n$th bispecial
factor of $\omega$. We recall that associated to $\omega$ is a sequence $(d_i)_{i \geq 0} \in A^{\mathbb{N}}$ (called the directive
word of $\omega$) defined by the condition that $d_i B_i$ is right special in $\omega$. (See for instance [28].)

Without loss of generality we can suppose that $a' = a$. We claim $b' = a$. Suppose to the
contrary that $b' = b$. Then both $axa$ and $bxb = b\tilde{x}b$ are factors of $v'$, contradicting that $\omega$ is
balanced. Hence we must have $a' = b' = a$ and so $u = ax\,ab\,xa$ and $v = ax\,ba\,xa$. Now $x$ is a
bispecial factor of the Sturmian word $\omega$. If $ax$ is a right special factor of $\omega$ then we are done by
Remark 2.7. Otherwise, if $bx$ is a right special factor of $\omega$, then this means that $d_n = b$ where
$d_n$ is the $n$th entry of the directive word of $\omega$. Let $\omega'$ be a Sturmian word whose directive word
$(\delta_i)_{i \geq 0}$ is defined by $\delta_i = d_i$ for $i \neq n$, and $\delta_n = a$. Then $x$ is a bispecial factor of $\omega'$ and $ax$ is
a right special factor of $\omega'$. It follows from Remark 2.7 that both $u$ and $v$ are factors of $\omega'$. □

As an immediate consequence of Theorem 2.6 we have:

Corollary 2.10. Let $u \in A^*$ be of the form $u = vx\,ab\,\tilde{x}w$ where $x$ is a right special factor of
length $k - 1$ of a Sturmian word. Set $u' = vx\,ba\,\tilde{x}w$. Then $u \sim_k u'$.

2.3. The number of $k$-Abelian classes in $A^n$

Here we shall estimate the number of $k$-Abelian equivalence classes of words in $A^n$. Fix
$k \geq 1$ and let $m \geq 2$ be the cardinality of the set $A$.

Lemma 2.11. The number of $k$-Abelian equivalence classes of $A^{n+1}$ is at least as large as the
number of $k$-Abelian equivalence classes of $A^n$.

Proof. If $k = 1$ or $n < k - 1$, then the claim is clear. Otherwise, let $B$ be a set of representatives
of the $k$-Abelian equivalence classes of $A^n$. The set $AB$ has $m$ times as many words as $B$. To
prove the lemma, we will show that there can be at most $m$ words in $AB$ that are $k$-Abelian
equivalent.

Let $a \in A$ and let $au_0, \ldots, au_m \in AB$ be $k$-Abelian equivalent. It needs to be shown that
some of these words are equal. Two of these words must have the same $k$th letter, let these be
$au$ and $av$. Because also $\mathrm{pref}_{k-1}(au) = \mathrm{pref}_{k-1}(av)$, it follows that $\mathrm{pref}_k(au) = \mathrm{pref}_k(av)$. If
$t \in A^k$, then either $|u|_t = |au|_t = |av|_t = |v|_t$ (if $t \neq \mathrm{pref}_k(au)$), or $|u|_t = |au|_t - 1 = |av|_t - 1 =
|v|_t$ (if $t = \mathrm{pref}_k(au)$). Thus $u$ and $v$ are $k$-Abelian equivalent and, by the definition of $B$,
$u = v$. This proves the claim. □

Let $s_1, s_2 \in A^{k-1}$ and let

$$S(s_1, s_2, n) = A^n \cap s_1 A^* \cap A^* s_2$$

be the set of words of length $n$ that start with $s_1$ and end with $s_2$. For every word $w \in
S(s_1, s_2, n)$ we can define a function

$$f_w : A^k \to \{0, \ldots, n - k + 1\}, \qquad f_w(t) = |w|_t.$$

If $u, v \in S(s_1, s_2, n)$, then $u \sim_k v$ if and only if $f_u = f_v$. To count the number of $k$-Abelian
equivalence classes, we need to count the number of the functions $f_w$. Not every function
$f : A^k \to \{0, \ldots, n - k + 1\}$ is possible. It must be

$$\sum_{t \in A^k} f(t) = n - k + 1, \qquad (2)$$

and there are also other restrictions, which are determined in Lemma 2.12.

If a function $f : A^k \to \mathbb{N}_0$ is given, then a directed multigraph $G_f$ can be defined as follows:
the set of vertices is $A^{k-1}$, and if $t = s_1 a = b s_2$, where $a, b \in A$, then there are $f(t)$ edges
from $s_1$ to $s_2$. If $f = f_w$, then this multigraph is related to the Rauzy graph of $w$. In the next
lemma, $\deg^-$ denotes the indegree and $\deg^+$ the outdegree of a vertex in $G_f$.

Lemma 2.12. For a function $f : A^k \to \mathbb{N}_0$ and words $s_1, s_2 \in A^{k-1}$, the following are
equivalent:

(i) there is a number $n$ and a word $w \in S(s_1, s_2, n)$ such that $f = f_w$,

(ii) there is an Eulerian path from $s_1$ to $s_2$ in $G_f$,

(iii) the underlying graph of $G_f$ is connected, except possibly for some isolated vertices, and
$\deg^-(s) = \deg^+(s)$ for every vertex $s$, except that if $s_1 \neq s_2$, then $\deg^-(s_1) = \deg^+(s_1) - 1$
and $\deg^-(s_2) = \deg^+(s_2) + 1$,

(iv) the underlying graph of $G_f$ is connected, except possibly for some isolated vertices, and

$$\sum_{a \in A} f(as) = \sum_{a \in A} f(sa) + c_s \qquad (s \in A^{k-1}), \qquad (3)$$

where

$$c_s = \begin{cases} -1, & \text{if } s = s_1 \neq s_2, \\ 1, & \text{if } s = s_2 \neq s_1, \\ 0, & \text{otherwise.} \end{cases}$$
Proof. (i) $\Leftrightarrow$ (ii): $w = a_1 \ldots a_n \in S(s_1, s_2, n)$ and $f = f_w$ if and only if

$$s_1 = a_1 \ldots a_{k-1} \to a_2 \ldots a_k \to \cdots \to a_{n-k+2} \ldots a_n = s_2$$

is an Eulerian path in $G_f$.

(ii) $\Leftrightarrow$ (iii): This is well known.

(iii) $\Leftrightarrow$ (iv): (iv) is just a reformulation of (iii) in terms of the function $f$. □
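The degree condition (iii), equivalently equation (3), is easy to observe on concrete words. The following sketch builds the in-degree minus out-degree balance of $G_f$ for $f = f_w$ and compares it with the constants $c_s$; it assumes $k \geq 2$ and $|w| \geq k$, and the function names `degree_balance` and `expected_c` are ours.

```python
from collections import Counter
from itertools import product

def degree_balance(w, k):
    """In-degree minus out-degree of each vertex of G_f for f = f_w (k >= 2).

    Vertices with balance 0 are omitted from the result.
    """
    balance = Counter()
    for i in range(len(w) - k + 1):
        t = w[i:i + k]           # each occurrence of t is one edge t[:-1] -> t[1:]
        balance[t[1:]] += 1      # the edge enters t[1:]
        balance[t[:-1]] -= 1     # the edge leaves t[:-1]
    return {s: b for s, b in balance.items() if b != 0}

def expected_c(w, k):
    """Non-zero constants c_s of equation (3) for s1 = pref_{k-1}(w), s2 = suff_{k-1}(w)."""
    s1, s2 = w[:k - 1], w[len(w) - k + 1:]
    return {} if s1 == s2 else {s1: -1, s2: 1}

# Exhaustive check over {a, b} for small k and n.
for k in (2, 3):
    for n in range(k, 9):
        for p in product("ab", repeat=n):
            w = "".join(p)
            assert degree_balance(w, k) == expected_c(w, k)

print(degree_balance("aabab", 3))  # {'ab': 1, 'aa': -1}
```

Reading $w$ from left to right traces the Eulerian path of item (ii), which is why all intermediate contributions cancel.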

In the next lemma we consider the independence of homogeneous systems related to the
equations (3) and (2).

Lemma 2.13. Let $x_t$, where $t \in A^k$, be $m^k$ unknowns. The system of equations

$$\sum_{a \in A} x_{as} = \sum_{a \in A} x_{sa} \qquad (s \in A^{k-1}) \qquad (4)$$

is not independent, but all of its proper subsystems are. If we add the equation

$$\sum_{t \in A^k} x_t = 0 \qquad (5)$$

to one of these independent systems, then the system remains independent.

Proof. The sum of the equations (4) is a trivial identity $\sum_{t \in A^k} x_t = \sum_{t \in A^k} x_t$, so every one of
these equations follows from the other $m^{k-1} - 1$ equations. If $s_1, s_2 \in A^{k-1}$ are two different
words, then $x_t = |s_1 s_2|_t$ for all $t$ is a solution of all the equations, except those with $s = s_1$
or $s = s_2$. This proves that all proper subsystems are independent. Addition of (5) keeps them
independent, because $x_t = 1$ for all $t$ is a solution of the system (4) but not of (5). □

Theorem 2.14. Let $k \geq 1$ and $m \geq 2$ be fixed numbers and let $A$ be an $m$-letter alphabet.
The number of $k$-Abelian equivalence classes of $A^n$ is $\Theta(n^{m^k - m^{k-1}})$.

Proof. Let $n \geq 2k - 2$, $f : A^k \to \{0, \ldots, n - k + 1\}$ and $u, v \in A^{k-1}$. By Lemma 2.12 there
is a word $w \in S(u, v, n)$ such that $f = f_w$ only if $f$ satisfies (2) and (3). Consider the system
formed by these equations. The function $f_w$ satisfies the equations for every $w \in S(u, v, n)$,
so the system has a solution. By Lemma 2.13 the rank of the coefficient matrix of the system
is $m^{k-1}$, so the general solution of this system is of the form

$$f(r_i) = \sum_{j=1}^{m^k - m^{k-1}} a_{ij} f(s_j) + b_i \qquad (i = 1, \ldots, m^{k-1}),$$

where the words $r_i$ and $s_j$ form the set $A^k$ and $a_{ij}, b_i$ are rational numbers. Because $0 \leq
f(s_j) \leq n - k + 1$, there are $O(n^{m^k - m^{k-1}})$ possible functions $f$.

Let $u = v$ and consider the system of equations (3). By Lemma 2.13 the general solution
of this homogeneous system is of the form

$$f(r_i) = \sum_{j=1}^{m^k - m^{k-1} + 1} a_{ij} f(s_j) \qquad (i = 1, \ldots, m^{k-1} - 1), \qquad (6)$$

where the words $r_i$ and $s_j$ form the set $A^k$ and $a_{ij}$ are rational numbers. The coefficients $a_{ij}$
do not depend on $n$. Let

$$c = \max\Big\{ \sum_{j=1}^{m^k - m^{k-1} + 1} |a_{ij}| \;\Big|\; 1 \leq i \leq m^{k-1} - 1 \Big\}$$

and let $d$ be the least common multiple of the denominators of the numbers $a_{ij}$. Every constant
function $f$ satisfies the system of equations. In particular, $f(t) = \lfloor n / 2m^k \rfloor$ for all $t$ is a solution
of the system. If we let

$$f(s_j) = \Big\lfloor \frac{n}{2m^k} \Big\rfloor + b_j, \quad \text{where } |b_j| \leq \frac{n}{2cm^k} - 1 \text{ and } d \mid b_j,$$

then the numbers

$$f(r_i) = \sum_{j=1}^{m^k - m^{k-1} + 1} a_{ij} f(s_j)$$

given by (6) are integers and $1 \leq f(t) \leq n/m^k - 1$ for all $t \in A^k$. Because $f(t) \geq 1$ for all $t$,
the underlying graph of $G_f$ is connected, so by Lemma 2.12 there is a word $w \in S(u, v, |w|)$
such that $f = f_w$. Because $f(t) \leq n/m^k - 1$ for all $t$, we get

$$|w| = \sum_{t \in A^k} f(t) + k - 1 \leq n - m^k + k - 1 \leq n.$$

There are $\Theta(n^{m^k - m^{k-1} + 1})$ ways to choose the numbers $b_j$. Every choice gives a different
function $f = f_w$ for some $w \in S(u, v, |w|)$ such that $|w| \leq n$. Let these words be $w_1, \ldots, w_N$.
No two of them are $k$-Abelian equivalent. Among these words there are at least $N/n$ words
of equal length. By Lemma 2.11, there are at least $N/n$ words of length $n$ such that no two
of them are $k$-Abelian equivalent, and $N/n = \Omega(n^{m^k - m^{k-1}})$. □
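For very small parameters the number of classes can simply be counted. The sketch below enumerates all binary words ($m = 2$) of length $n \leq 10$ and counts $k$-Abelian classes for $k = 1$ and $k = 2$, where Theorem 2.14 predicts growth of degree $m^k - m^{k-1} = 1$ and $2$ respectively. It is a brute-force illustration only (exponential in $n$), and the names `signature` and `number_of_classes` are ours.

```python
from collections import Counter
from itertools import product

def signature(w, k):
    """Counts of all factors of length <= k; equal signatures <=> k-Abelian equivalent."""
    return frozenset(Counter(w[i:i + L] for L in range(1, k + 1)
                             for i in range(len(w) - L + 1)).items())

def number_of_classes(alphabet, k, n):
    """Number of k-Abelian equivalence classes of words of length n over `alphabet`."""
    return len({signature("".join(p), k) for p in product(alphabet, repeat=n)})

abelian = [number_of_classes("01", 1, n) for n in range(1, 11)]
two_abelian = [number_of_classes("01", 2, n) for n in range(1, 11)]

# k = 1: a class is determined by the number of 0s, so there are n + 1 classes.
assert abelian == [n + 1 for n in range(1, 11)]
# k = 2: all words of length <= 3 are pairwise inequivalent (Lemma 2.4).
assert two_abelian[:4] == [2, 4, 8, 14]
# Lemma 2.11: the number of classes is non-decreasing in n.
assert all(a <= b for a, b in zip(two_abelian, two_abelian[1:]))

print(two_abelian)
```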



3. $k$-Abelian complexity & periodicity

In this section we prove that if $\mathcal{P}^{(k)}_\omega(n_0) < q^{(k)}(n_0)$ for some $k \in \mathbb{Z}^+ \cup \{+\infty\}$ and $n_0 \geq 1$,
then $\omega$ is ultimately periodic (see Corollary 3.3 below). For this purpose we introduce an
auxiliary family of equivalence relations $\mathcal{R}_k$ on $A^*$ defined as follows: Let $k \in \mathbb{Z}^+ \cup \{+\infty\}$.
Given $u, v \in A^*$ we write $u \mathcal{R}_k v$ if and only if $u \sim_1 v$ (i.e., $u \sim_{ab} v$) and $u$ and $v$ share a common
prefix and a common suffix of length $k - 1$. In case $|u| \leq k - 1$, then $u \mathcal{R}_k v$ means $u = v$.

It follows immediately from Lemma 2.3 that

$$u \sim_k v \implies u \mathcal{R}_k v. \qquad (7)$$

In general the converse is not true: For example, taking $u = 0011$ and $v = 0101$ we see
that $u \mathcal{R}_2 v$ yet $u$ and $v$ are not 2-Abelian equivalent. However, in view of Proposition 2.8 we
have:

Corollary 3.1. Let $u$ and $v$ be two factors of a Sturmian word $\omega$, and $k \in \mathbb{Z}^+ \cup \{+\infty\}$. Then
$u \sim_k v$ if and only if $u \mathcal{R}_k v$.
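The relation $\mathcal{R}_k$ is cheap to test, which makes the implication (7) and its failing converse easy to observe. The sketch below checks (7) exhaustively on short binary words and reproduces the example $u = 0011$, $v = 0101$; the function names `signature` and `related_R` are ours.

```python
from collections import Counter
from itertools import combinations, product

def signature(w, k):
    """Counts of all factors of length <= k; equal signatures <=> k-Abelian equivalent."""
    return frozenset(Counter(w[i:i + L] for L in range(1, k + 1)
                             for i in range(len(w) - L + 1)).items())

def related_R(u, v, k):
    """u R_k v: Abelian equivalent with a common prefix and suffix of length k - 1.

    For words of length at most k - 1 the relation means equality.
    """
    if len(u) != len(v):
        return False
    if len(u) <= k - 1:
        return u == v
    return (Counter(u) == Counter(v)
            and u[:k - 1] == v[:k - 1]
            and u[len(u) - k + 1:] == v[len(v) - k + 1:])

# Implication (7): k-Abelian equivalence implies R_k.
for k in (1, 2, 3):
    for n in range(1, 9):
        words = ["".join(p) for p in product("01", repeat=n)]
        for u, v in combinations(words, 2):
            if signature(u, k) == signature(v, k):
                assert related_R(u, v, k)

# The converse fails in general.
print(related_R("0011", "0101", 2))                   # True
print(signature("0011", 2) == signature("0101", 2))   # False
```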

Let $\omega \in A^{\mathbb{N}}$. Associated to the relation $\mathcal{R}_k$ is a complexity function, denoted $\rho^{(k)}_\omega(n)$, which
counts the number of distinct $\mathcal{R}_k$ equivalence classes of factors of $\omega$ of length $n$. It follows
from (7) above that for each $n$ we have

$$\rho^{(k)}_\omega(n) \leq \mathcal{P}^{(k)}_\omega(n). \qquad (8)$$
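We recall the function $q^{(k)} : \mathbb{N} \to \mathbb{N}$ ($k \in \mathbb{Z}^+ \cup \{+\infty\}$) defined by

$$q^{(k)}(n) = \begin{cases} n + 1 & \text{for } n \leq 2k - 1, \\ 2k & \text{for } n \geq 2k. \end{cases}$$

Theorem 3.2. Let $\omega = a_0 a_1 a_2 \cdots \in A^{\mathbb{N}}$ and $k \in \mathbb{Z}^+ \cup \{+\infty\}$. If $\rho^{(k)}_\omega(n_0) < q^{(k)}(n_0)$ for some
$n_0 \geq 1$, then $\omega$ is ultimately periodic.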



Proof. The result is well known in case $k = +\infty$ (see [23]). For $k \in \mathbb{Z}^+$, we proceed by
induction on $k$. In case $k = 1$, $\mathcal{R}_1$ is simply the usual notion of Abelian equivalence and
the result is the known one for Abelian complexity.

Now suppose $k > 1$ and that $\rho^{(k)}_\omega(n_0) < q^{(k)}(n_0)$ for some $n_0 \geq 1$. It follows immediately
from the definition of $\mathcal{R}_k$ that if $u \mathcal{R}_k v$ and $|u| \leq 2k - 1$, then $u = v$. Thus, if $\rho^{(k)}_\omega(n_0) < q^{(k)}(n_0)$
where $n_0 \leq 2k - 1$, then $p_\omega(n_0) < n_0 + 1$ and so $\omega$ is ultimately periodic by the well known
result of Morse and Hedlund in [23].

Thus we suppose that $\rho^{(k)}_\omega(n_0) < 2k$ for some $n_0 \geq 2k$. We claim that $\omega$ must be ultimately
periodic. Suppose to the contrary that $\omega$ is aperiodic. We shall show that this implies
that $\rho^{(k-1)}_\nu(n_0 - 2) < 2(k - 1)$ where $\nu = a_0^{-1}\omega$ denotes the first shift of $\omega$, i.e., the word
obtained from $\omega$ by removing the first letter of $\omega$. Since $n_0 - 2 \geq 2(k - 1)$ we deduce that
$\rho^{(k-1)}_\nu(n_0 - 2) < q^{(k-1)}(n_0 - 2)$. But then by induction hypothesis on $k$, it follows that $\nu$ (and
hence $\omega$) is ultimately periodic, a contradiction.
Consider the map

$$\Psi : \mathcal{F}_\omega(n_0)/\mathcal{R}_k \longrightarrow \mathcal{F}_\nu(n_0 - 2)/\mathcal{R}_{k-1}$$

defined by

$$\Psi([aub]_k) = [u]_{k-1}$$

where $a, b \in A$, and $u \in A^*$ is of length $n_0 - 2$. Here $[u]_k$ denotes the $\mathcal{R}_k$ equivalence class of
$u$. To see that $\Psi$ is well defined, suppose $aub \, \mathcal{R}_k \, cvd$. Then since $k > 1$, it follows that $a = c$
and $b = d$ and thus that $u \mathcal{R}_1 v$. Moreover as $aub$ and $cvd$ share a common prefix and suffix
of length $k - 1$, it follows that $u$ and $v$ share a common prefix and suffix of length $k - 2$. Thus
$u \mathcal{R}_{k-1} v$ as required. Clearly the mapping is surjective, in fact for each $u \in \mathcal{F}_\nu(n_0 - 2)$ there
exist $a, b \in A$ such that $aub \in \mathcal{F}_\omega(n_0)$. This is the reason for replacing $\omega$ by $\nu$.

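We now show that either there exist distinct classes $[u]_{k-1}, [v]_{k-1} \in \mathcal{F}_\nu(n_0 - 2)/\mathcal{R}_{k-1}$ for
which

$$\min\{\mathrm{Card}(\Psi^{-1}([u]_{k-1})), \mathrm{Card}(\Psi^{-1}([v]_{k-1}))\} \geq 2, \qquad (9)$$

or there exists a class $[u]_{k-1} \in \mathcal{F}_\nu(n_0 - 2)/\mathcal{R}_{k-1}$ for which

$$\mathrm{Card}(\Psi^{-1}([u]_{k-1})) \geq 3. \qquad (10)$$

In either case it follows that

$$\mathrm{Card}(\mathcal{F}_\nu(n_0 - 2)/\mathcal{R}_{k-1}) \leq \mathrm{Card}(\mathcal{F}_\omega(n_0)/\mathcal{R}_k) - 2 < 2(k - 1).$$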

Since $\omega$ is assumed to be aperiodic, $\omega$ contains both a left special factor of the form $uc$ and
a right special factor of the form $dv$ of length $n_0 - 1$ for some choice of $c, d \in A$ and $u, v \in A^*$.
Thus there exist distinct letters $a, b \in A$ such that $auc$ and $buc$ are factors of $\omega$. Moreover
since $a \neq b$, it follows that $[auc]_k \neq [buc]_k$. Thus $\mathrm{Card}(\Psi^{-1}([u]_{k-1})) \geq 2$. Similarly, there exist
distinct letters $a', b' \in A$ such that $dva'$ and $dvb'$ are factors of $\omega$, and since $a' \neq b'$, it follows
that $[dva']_k \neq [dvb']_k$. Thus $\mathrm{Card}(\Psi^{-1}([v]_{k-1})) \geq 2$. In case $[u]_{k-1} \neq [v]_{k-1}$, we obtain the
desired inequality (9). In case $[u]_{k-1} = [v]_{k-1}$, since $a \neq b$ and $a' \neq b'$ it follows that

$$\mathrm{Card}\{[auc]_k, [buc]_k, [dva']_k, [dvb']_k\} \geq 3$$

which yields the inequality (10). This completes the proof of Theorem 3.2. □

Corollary 3.3. Let $\omega \in A^{\mathbb{N}}$ and $k \in \mathbb{Z}^+ \cup \{+\infty\}$. If $\mathcal{P}^{(k)}_\omega(n_0) < q^{(k)}(n_0)$ for some $n_0 \geq 1$ then
$\omega$ is ultimately periodic.

Proof. As a consequence of the inequality ©, if $\mathcal{P}^{(k)}_\omega(n_0) < q^{(k)}(n_0)$ then $p^{(k)}_\omega(n_0) < q^{(k)}(n_0)$,
whence by Theorem 3.2 it follows that $\omega$ is ultimately periodic. □

The same method of proof of Theorem 3.2 can be used to prove the following:

Corollary 3.4. Let $\omega$ be a bi-infinite word over the alphabet $A$ and $k \in \mathbb{Z}^+ \cup \{+\infty\}$. If
$\mathcal{P}^{(k)}_\omega(n_0) < q^{(k)}(n_0)$ for some $n_0 \geq 1$, then $\omega$ is periodic.

We conclude this section with a few remarks:



Remark 3.5. In the special case $k = +\infty$, the condition given in Corollary 3.3 gives a
characterization of ultimately periodic words by means of factor complexity: $\omega \in A^{\mathbb{N}}$ is ultimately
periodic if and only if $p_\omega(n_0) < n_0 + 1$ for some $n_0 \geq 1$. However, Abelian complexity does not
yield such a characterization. Indeed, both Sturmian words and the ultimately periodic word
$01^\infty = 0111\cdots$ have the same Abelian complexity. More generally, the ultimately periodic
word $0^{2k-1}1^\infty$ has the same $k$-Abelian complexity as a Sturmian word (see Theorem 4.1
below).

Remark 3.6. The result of Corollary 3.4 is already known to be true in the special cases
$k = +\infty$ (see [23]) and $k = 1$ (see Remark 4.07 in [4]). In these special cases, the converse is
also true. But for general $2 \leq k < +\infty$ the converse is false. For instance, let $\mathrm{Card}(A) = 5$,
and let $u$ be a word containing at least one occurrence of every $x \in A^3$. Let $\omega$ be the periodic
word $\omega = \ldots uuuu \ldots$. Then $\mathcal{P}^{(2)}_\omega(n) \geq 5$ for every $n \geq 1$.

4. k-Abelian complexity of Sturmian words

In this section we determine the $k$-Abelian complexity of Sturmian words and show that
for each $k$, the complexity function completely characterizes Sturmian words amongst all
aperiodic words. More precisely:

Theorem 4.1. Fix $k \in \mathbb{Z}^+ \cup \{+\infty\}$. Let $\omega \in A^{\mathbb{N}}$ be an aperiodic word. The following
conditions are equivalent:

• $\omega$ is a balanced binary word, that is, Sturmian.

• $\mathcal{P}^{(k)}_\omega(n) = q^{(k)}(n) = \begin{cases} n + 1 & \text{for } 0 \leq n \leq 2k - 1, \\ 2k & \text{for } n \geq 2k. \end{cases}$
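As a sanity check on the statement of Theorem 4.1, the following sketch counts $k$-Abelian classes among the factors of a long prefix of the Fibonacci word (a standard example of a Sturmian word) and compares the count with $q^{(k)}(n)$. The helper names and the prefix length 2000 are illustrative assumptions and not part of the paper; the check only covers small $n$ and assumes that the prefix already contains every factor of the lengths tested.

```python
from collections import Counter


def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def factors(word, n):
    """Set of factors of length n occurring in the finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def k_abelian_signature(word, k):
    """Occurrence counts of every factor of length <= k (overlaps counted).

    Two words are k-Abelian equivalent exactly when their signatures agree.
    """
    counts = Counter(
        word[i:i + m]
        for m in range(1, k + 1)
        for i in range(len(word) - m + 1)
    )
    return frozenset(counts.items())


def k_abelian_complexity(word, n, k):
    """Number of k-Abelian classes among the factors of length n."""
    return len({k_abelian_signature(u, k) for u in factors(word, n)})


def q(k, n):
    """The auxiliary function q^(k)(n) appearing in Theorem 4.1."""
    return n + 1 if n <= 2 * k - 1 else 2 * k


if __name__ == "__main__":
    prefix = fibonacci_word(2000)
    for k in (1, 2, 3):
        for n in range(1, 13):
            print(k, n, k_abelian_complexity(prefix, n, k), q(k, n))
```

Each printed row lists $k$, $n$, the computed number of classes and $q^{(k)}(n)$; for a Sturmian word the last two columns should agree.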

Our proof of Theorem 4.1 will make use of the following functions $g_i$, which transform
binary words by changing the letters around a specific point. For words $w \in \{0,1\}^n$ we define
$g_1, \ldots, g_n$ as follows:

\[ g_i(w) = \begin{cases} u10v, & \text{if } i < n,\ w = u01v \text{ and } |u0| = i, \\ u1, & \text{if } i = n \text{ and } w = u0. \end{cases} \]



Lemma 4.2. Let $n \geq 1$ and let $w \in \{0,1\}^{\mathbb{N}}$ be Sturmian. There is a word $u_1 \in \{0,1\}^n$ and
a permutation $\sigma$ of $\{1, \ldots, n\}$ such that if $u_{i+1} = g_{\sigma(i)}(u_i)$ for $i = 1, \ldots, n$, then $u_1, \ldots, u_{n+1}$
are the factors of $w$ of length $n$.

Proof. Let $u_1, \ldots, u_{n+1}$ be the factors of $w$ of length $n$ in lexicographic order. It follows from
Theorem 1.1 in [2] that for every $i$ there is an $m$ such that $u_{i+1} = g_m(u_i)$. It needs to be
proved that the $m$'s are all different. Let $u_{i+1} = g_m(u_i)$ and $u_{i'+1} = g_m(u_{i'})$. For every $j$

\[ |\mathrm{pref}_m(u_j)|_1 \leq |\mathrm{pref}_m(u_{j+1})|_1 \]

and for $j \in \{i, i'\}$

\[ |\mathrm{pref}_m(u_j)|_1 < |\mathrm{pref}_m(u_{j+1})|_1. \]

If $i \neq i'$, then

\[ |\mathrm{pref}_m(u_1)|_1 + 2 \leq |\mathrm{pref}_m(u_{n+1})|_1 \]

which contradicts the balance property of Sturmian words. □

Example 4.3. The factors of the Fibonacci word of length six are

$u_1 = 001001$, $u_2 = 001010 = g_5(u_1)$, $u_3 = 010010 = g_2(u_2)$, $u_4 = 010100 = g_4(u_3)$,
$u_5 = 100100 = g_1(u_4)$, $u_6 = 100101 = g_6(u_5)$, $u_7 = 101001 = g_3(u_6)$.

We have $u_2 \sim_2 u_3 \sim_2 u_4$ and $u_6 \sim_2 u_7$. There are no other 2-Abelian equivalences between
these factors.
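The chain in Example 4.3 can be reproduced mechanically. The sketch below implements the partial maps $g_i$ (returning `None` where $g_i$ is undefined), lists the factors of length six of a Fibonacci prefix in lexicographic order, and recovers the index $\sigma(i)$ used at each step. The function names and the prefix length 200 are assumptions made for illustration only.

```python
def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def factors(word, n):
    """Set of factors of length n occurring in the finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def g(i, w):
    """Apply g_i (1-based index i) to the binary word w.

    For i < n the map turns the occurrence of 01 at positions i, i+1 into 10;
    for i = n it turns a final 0 into 1. Returns None if g_i is undefined on w.
    """
    n = len(w)
    if i < n and w[i - 1:i + 1] == "01":
        return w[:i - 1] + "10" + w[i + 1:]
    if i == n and w[-1] == "0":
        return w[:-1] + "1"
    return None


def transition_indices(sorted_factors):
    """For consecutive words u, v find an m with g_m(u) = v (None if absent)."""
    sigma = []
    for u, v in zip(sorted_factors, sorted_factors[1:]):
        matches = [m for m in range(1, len(u) + 1) if g(m, u) == v]
        sigma.append(matches[0] if matches else None)
    return sigma


if __name__ == "__main__":
    us = sorted(factors(fibonacci_word(200), 6))
    print(us)
    print(transition_indices(us))
```

By Lemma 4.2 the indices returned for a Sturmian word should form a permutation of $\{1, \ldots, n\}$.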

Proof of Theorem 4.1. First let us suppose $\omega \in \{0,1\}^{\mathbb{N}}$ is Sturmian and let $1 \leq k < +\infty$. Let
$n \leq 2k - 1$. By Lemma 2.4 two factors $u$ and $v$ of $\omega$ of length $n$ are $k$-Abelian equivalent if
and only if $u = v$. Thus $\mathcal{P}^{(k)}_\omega(n) = n + 1$ as required.

Next let $n \geq 2k$ and let $u_1, \ldots, u_{n+1}$ and $\sigma$ be as in Lemma 4.2. If $k \leq \sigma(i) \leq n - k$, then
there are words $s, t \in \{0,1\}^*$ and $u, v \in \{0,1\}^{k-1}$ and letters $a, b \in \{0,1\}$ so that $u_i = su01vt$
and $u_{i+1} = g_{\sigma(i)}(u_i) = su10vt$. We prove that $u_i \sim_k u_{i+1}$. The prefixes and suffixes of $u_i$ and
$u_{i+1}$ of length $k - 1$ are the same. The factors of $u_i$ of length $k$ are the factors of $su$, $u01v$
and $vt$ of length $k$, and the factors of $u_{i+1}$ of length $k$ are the factors of $su$, $u10v$ and $vt$ of
length $k$. Because $u01v$ and $u10v$ are factors of $\omega$, it follows that $u$ is right special and $v$ is left
special and hence equal to the reversal of $u$. By Theorem 2.6 $u01v$ and $u10v$ are $k$-Abelian
equivalent. This proves that $u_i \sim_k u_{i+1}$ if $k \leq \sigma(i) \leq n - k$. Since $\sigma$ is a permutation of
$\{1, \ldots, n\}$, exactly $n - 2k + 1$ indices $i$ satisfy this condition. Thus the words $u_1, \ldots, u_{n+1}$
are in at most $(n + 1) - (n - 2k + 1) = 2k$ different $k$-Abelian equivalence classes and
$\mathcal{P}^{(k)}_\omega(n) \leq 2k$. By Corollary 3.3 $\mathcal{P}^{(k)}_\omega(n) = 2k$.

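Next let $1 \leq k < +\infty$ and let $\omega \in A^{\mathbb{N}}$ be aperiodic and

\[ \mathcal{P}^{(k)}_\omega(n) = q^{(k)}(n) = \begin{cases} n + 1 & \text{for } 0 \leq n \leq 2k - 1, \\ 2k & \text{for } n \geq 2k. \end{cases} \]

Taking $n = 1$ we see that $\omega$ is binary (say $\omega \in \{0,1\}^{\mathbb{N}}$). We must show that $\omega$ is balanced.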



We first recall some basic facts concerning factors of Sturmian words (see for instance [28]):
Let $\eta \in \{0,1\}^{\mathbb{N}}$ be a Sturmian word, and let $\mathcal{F}_\eta(n)$ denote the factors of $\eta$ of length $n$. The
set $\mathcal{F}_\eta(n + 1)$ is completely determined from the set $\mathcal{F}_\eta(n)$ unless $\eta$ has a bispecial factor $B$
of length $n - 1$ in which case both $0B$ and $1B$ are factors of $\eta$ and exactly one of the two
is right special. If $0B$ is right special, then every occurrence of $1B$ in $\eta$ is an occurrence of
$1B0$. If $v$ is a factor of $\eta$ and $u$ a prefix of $v$, we write $u \vdash v$ if every occurrence of $u$ in $\eta$ is
an occurrence of $v$. Thus if $0B$ is right special, then $1B \vdash 1B0$, and similarly if $1B$ is right
special, then $0B \vdash 0B1$.

Now suppose to the contrary that the aperiodic binary word $\omega$ is not Sturmian. Then
there exists a smallest positive integer $n \geq 1$ and a Sturmian word $\eta$ such that $\mathcal{F}_\omega(n) = \mathcal{F}_\eta(n)$
but $\mathcal{F}_\omega(n + 1) \neq \mathcal{F}_{\eta'}(n + 1)$ for every choice of Sturmian word $\eta'$. This means that $\omega$ has a
bispecial factor $B$ of length $n - 1$ and both $0B$ and $1B$ are in $\mathcal{F}_\omega(n)$ and one of the following
must occur: i) Neither $0B$ nor $1B$ is right special in $\omega$; ii) There exists a unique $a \in \{0,1\}$ such
that $aB$ is right special, and $(1 - a)B \vdash (1 - a)B(1 - a)$; iii) Both $0B$ and $1B$ are right special
in $\omega$. We will show that since $\omega$ is aperiodic, only case iii) is in fact possible. Clearly, if neither
$0B$ nor $1B$ were right special, then $\mathrm{Card}(\mathcal{F}_\omega(n)) = \mathrm{Card}(\mathcal{F}_\omega(n + 1))$ whence $\omega$ is ultimately
periodic, a contradiction. Next suppose case ii) occurs. We may suppose without loss of
generality that $0B$ is right special and $1B \vdash 1B1$. If $1 \vdash 1B$ (and hence $1 \vdash 1B1$), then we
would have $1 \vdash 1(B1)^n$ for every $n \geq 1$ from which it follows that the tail of $\omega$ corresponding
to the first occurrence of $1$ in $\omega$ is periodic. Thus if $\neg(1 \vdash 1B)$, then there exists a bispecial
factor $B'$ of $\omega$ with $0 \leq |B'| < |B|$ such that $1B'$ is right special and $1B'1 \vdash 1B$ and hence
$1B'1 \vdash 1B1$. Writing $1B1 = 1B'1V$ we have $1B'1 \vdash 1B'1V$. We next show by induction on $n$
that $1B'1V^n$ is a palindrome for each $n \geq 1$. Clearly this is true for $n = 1$ since $1B'1V = 1B1$.
Next suppose $1B'1V^n$ is a palindrome. Then

\[ \overline{1B'1V^{n+1}} = \overline{V}^{\,n+1}\,\overline{1B'1} = \overline{V}\,\overline{V}^{\,n}\,\overline{1B'1} = \overline{V}\,1B'1V^{n} = \overline{V}\,\overline{1B'1}\,V^{n} = 1B'1VV^{n} = 1B'1V^{n+1}. \]

Having established that $1B'1V^n$ is a palindrome, it follows that $1B'1$ is a suffix of $1B'1V^n$
and hence $1B'1V^n \vdash 1B'1V^{n+1}$ for each $n \geq 0$. Whence as before $\omega$ is ultimately periodic.
Thus if $\omega$ is not Sturmian, case iii) must occur. This implies that

\[ \mathcal{F}_\omega(n + 1) = \mathcal{F}_\eta(n + 1) \cup \{0B0, 1B1\} \]

and $\mathrm{Card}(\mathcal{F}_\eta(n + 1) \cap \{0B0, 1B1\}) = 1$. Since $\eta$ is Sturmian, the number of $k$-Abelian classes
of factors of $\eta$ of length $n + 1$ is equal to $q^{(k)}(n + 1)$. But the additional factor $aBa$ of $\omega$
of length $n + 1$ introduces a new $k$-Abelian class since it is not even Abelian equivalent to
any other factor of $\eta$ (and hence $\omega$) of length $n + 1$. Thus $\mathcal{P}^{(k)}_\omega(n + 1) = q^{(k)}(n + 1) + 1$, a
contradiction. Thus $\omega$ is Sturmian.

□

Remark 4.4. In view of Corollary 3.3, within the class of aperiodic words, Sturmian words
have the lowest possible $k$-Abelian complexity. See [H for other instances in which
Sturmian words have the lowest complexity amongst all aperiodic words.



5. Bounded k-Abelian complexity & k-Abelian repetitions



There is great interest in avoidability of repetitions in infinite words. This originated with
the classical work of Thue [31] and [32], in which he established the existence of an infinite
binary (resp. ternary) word avoiding cubes (resp. squares). It was later shown that to avoid
Abelian cubes or Abelian squares, one needs 3-letter or 4-letter alphabets respectively (see
0] and ). The corresponding problems for $k$-abelian repetitions turned out to be quite
nontrivial. It follows easily that the smallest alphabet where $k$-abelian cubes can be avoided
is either 2 or 3, and similarly the smallest alphabet where $k$-abelian squares can be avoided
is either 3 or 4. In the latter case for $k = 2$ a computer verification revealed that the correct
value is 4, as in the case of Abelian repetitions: each ternary 2-abelian square-free word is of
length at most 536 [12]. In the former case computer verification shows that there exist binary
words of length 100000 which are 2-abelian cube-free fx3 ]. It is still unknown whether there
exists an infinite binary word which is 2-abelian cube-free. For some larger values of $k$ such
infinite words exist. In the case of binary alphabets and cubes it was shown in a sequence of
papers that an infinite word avoiding $k$-abelian cubes can be constructed for $k = 8$, $k = 5$
and for $k = 3$ (see 13], [i3] and 21] respectively). So only the value $k = 2$ remains open. It
would be extremely surprising if no such infinite words exist. For avoiding $k$-abelian squares
in a ternary alphabet the situation is equally challenging. We know that for $k = 3$ there
exist words of length 100000 avoiding 3-abelian squares. The avoidability in infinite words of
$k$-abelian squares in a ternary alphabet is only known for large values of $k$ ($k \geq 64$) (see [HI]).

In this section we prove that $k$-Abelian repetitions are unavoidable in words having bounded
$k$-Abelian complexity. For each positive integer $k$ we set

\[ A^{\leq k} = \{x \in A^* : |x| \leq k\}. \]

Given an infinite word $\omega = a_0a_1a_2\cdots \in A^{\mathbb{N}}$, for each $0 \leq i \leq j < +\infty$ we denote by $\omega[i,j]$ the
factor $a_ia_{i+1}\cdots a_j$.

Definition 5.1. Let $k$ and $B$ be positive integers and $\omega \in A^{\mathbb{N}}$. We say $\omega$ is $(k,B)$-balanced
if and only if for all factors $u$ and $v$ of $\omega$ of equal length, and for all $x \in A^{\leq k}$ we have
$\bigl|\,|u|_x - |v|_x\,\bigr| \leq B$. We say $\omega$ is arbitrarily $k$-imbalanced if $\omega$ is not $(k,B)$-balanced for any
positive integer $B$.

An elementary but key observation is the following:

Lemma 5.2. Let $k$ be a positive integer and $\omega \in A^{\mathbb{N}}$. Then $\omega$ has bounded $k$-Abelian complexity
if and only if $\omega$ is $(k,B)$-balanced for some positive integer $B$.

Proof. Clearly if $\mathcal{P}^{(k)}_\omega$ is bounded, say by $B$, then $\omega$ is $(k, B - 1)$-balanced. Conversely, if $\omega$ is
$(k,B)$-balanced, then for each positive integer $n$ and for each $x \in A^*$ with $|x| \leq k$ we have

\[ \mathrm{Card}\{|u|_x : u \in \mathcal{F}_\omega(n)\} \leq B + 1. \]

It follows that

\[ \mathcal{P}^{(k)}_\omega(n) \leq (B + 1)^K \]

where $K = \mathrm{Card}\, A^{\leq k}$.

□
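To make Definition 5.1 concrete, the following sketch estimates the smallest $B$ for which the $(k,B)$-balance inequality holds among the factors, up to a given length, of a finite prefix of the Fibonacci word. It is only a finite experiment and proves nothing about the infinite word; the function names, the prefix length and the cut-off `max_len` are assumptions chosen for illustration, and the empty word is omitted from $A^{\leq k}$ since it occurs equally often in words of equal length.

```python
from itertools import product


def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def factors(word, n):
    """Set of factors of length n occurring in the finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def count_occurrences(u, x):
    """Number of (possibly overlapping) occurrences of x in u."""
    return sum(1 for i in range(len(u) - len(x) + 1) if u.startswith(x, i))


def balance_constant(word, k, max_len):
    """Largest gap | |u|_x - |v|_x | over equal-length factors u, v of `word`
    with length <= max_len and non-empty x of length <= k."""
    alphabet = sorted(set(word))
    patterns = [
        "".join(p) for m in range(1, k + 1) for p in product(alphabet, repeat=m)
    ]
    worst = 0
    for n in range(1, max_len + 1):
        same_length = factors(word, n)
        for x in patterns:
            counts = [count_occurrences(u, x) for u in same_length]
            worst = max(worst, max(counts) - min(counts))
    return worst


if __name__ == "__main__":
    prefix = fibonacci_word(2000)
    for k in (1, 2, 3):
        print(k, balance_constant(prefix, k, 20))
```

For a Sturmian word the printed value for each $k$ should not exceed $k$, in line with the Fagnot and Vuillon bound quoted in the text.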

Fix a positive integer $k$. It follows from Theorem 4.1 and Lemma 5.2 that each Sturmian
word is $(k,B)$-balanced for some positive integer $B$ (depending on $k$). Actually, I. Fagnot and
L. Vuillon proved in [9] that every Sturmian word is $(k,k)$-balanced.

Definition 5.3. Fix $k \in \mathbb{Z}^+ \cup \{+\infty\}$, and $N$ a positive integer. By a $k$-Abelian $N$-power we
mean a word $U$ of the form $U = U_1U_2\cdots U_N$ such that $U_i \sim_k U_j$ for all $1 \leq i, j \leq N$.

In this section we shall prove the following result: 

Theorem 5.4. Fix $k \in \mathbb{Z}^+ \cup \{+\infty\}$. Let $\omega = a_0a_1a_2\cdots \in A^{\mathbb{N}}$ be an infinite word on a finite
alphabet $A$ having bounded $k$-Abelian complexity. Let $D \subseteq \mathbb{N}$ be a set of positive upper density,
that is

\[ \limsup_{n \to \infty} \frac{\mathrm{Card}(D \cap \{1, 2, \ldots, n\})}{n} > 0. \]

Then, for every positive integer $N$, there exist $i$ and $\ell$ such that $\{i, i + \ell, i + 2\ell, \ldots, i + \ell N\} \subseteq D$
and the $N$ consecutive blocks $(\omega[i + j\ell, i + (j + 1)\ell - 1])_{0 \leq j \leq N-1}$ of length $\ell$ are pairwise $k$-Abelian
equivalent. In particular, $\omega$ contains arbitrarily high $k$-Abelian powers.

Remark 5.5. The result in Theorem 5.4 is already known in the special case of $D = \mathbb{N}$ and
$k = +\infty$ and $k = 1$ (see [23] and [27] respectively).



Before proving Theorem 5.4 we give some immediate consequences:

Corollary 5.6. Let $k$ and $N$ be positive integers, and $\omega$ an infinite word avoiding $k$-Abelian
$N$-powers. Then $\omega$ is arbitrarily $k$-imbalanced.

Proof. This follows immediately from Lemma 5.2 and Theorem 5.4. □

Corollary 5.7. Let $\omega$ be a Sturmian word. Then $\omega$ contains $k$-Abelian $N$-powers for all
positive integers $k$ and $N$.

Proof. This follows immediately from Theorems 4.1 and 5.4; in fact the $k$-Abelian complexity
$\mathcal{P}^{(k)}_\omega$ is bounded (by $2k$) for each positive integer $k$. □
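Corollary 5.7 can be explored experimentally. The sketch below performs a brute-force search for a $k$-Abelian $N$-power (Definition 5.3) inside a finite prefix of the Fibonacci word. The function names, the prefix length and the bound `max_block` on the block length are illustrative assumptions; a result of `None` only means that nothing was found within those bounds.

```python
from collections import Counter


def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def k_abelian_signature(word, k):
    """Occurrence counts of every factor of length <= k (overlaps counted)."""
    counts = Counter(
        word[i:i + m]
        for m in range(1, k + 1)
        for i in range(len(word) - m + 1)
    )
    return frozenset(counts.items())


def find_k_abelian_power(word, k, N, max_block):
    """Return (start, block_length) of some k-Abelian N-power in `word`,
    trying block lengths 1..max_block in increasing order, or None."""
    for length in range(1, max_block + 1):
        for start in range(len(word) - N * length + 1):
            signatures = {
                k_abelian_signature(
                    word[start + j * length:start + (j + 1) * length], k
                )
                for j in range(N)
            }
            if len(signatures) == 1:
                return start, length
    return None


if __name__ == "__main__":
    prefix = fibonacci_word(200)
    print(find_k_abelian_power(prefix, k=2, N=3, max_block=10))
```

Positions are 0-based indices into the prefix. Larger values of $N$ may require a longer prefix and a larger `max_block` before a power is found.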

Remark 5.8. It is known that a Sturmian word $\omega$ contains an $N$-power for each positive
integer $N$ if and only if the sequence of partial quotients in the continued fraction expansion
of the slope of $\omega$ is unbounded. So, a Sturmian word whose corresponding slope has bounded
partial quotients (e.g., the Fibonacci word) will not contain $N$-powers for $N$ sufficiently large
(e.g., the Fibonacci word contains no 4-powers [17], [22]). However, every Sturmian word will
contain arbitrarily high $k$-Abelian powers.

Our proof of Theorem 5.4 will make use of the following well known result first conjectured
by Erdős and Turán and later proved by E. Szemerédi:

Theorem 5.9. [Szemerédi's theorem [30]] Let $D \subseteq \mathbb{N}$ be a set of positive upper density. Then
$D$ contains arbitrarily long arithmetic progressions.

Proof of Theorem 5.4. Let $D \subseteq \mathbb{N}$ be a set of positive upper density. First we consider the
case $k = +\infty$. By assumption $\mathcal{P}^{(+\infty)}_\omega(n)$ is bounded. This is equivalent to saying that $\omega$ has
bounded factor complexity. It follows by Morse-Hedlund that $\omega$ is ultimately periodic, i.e.,
$\omega = UV^\infty$ for some $U, V \in A^*$. For each $i \geq 0$, set $D_i = D \cap \{i + j|V| : j = 1, 2, 3, \ldots\}$. Pick
$i \geq |U|$ such that the set $D_i$ has positive upper density. Then an arithmetic progression of
length $N + 1$ in $D_i$ (guaranteed by Szemerédi's theorem) determines the $N$th power of some
cyclic conjugate of $V$.

Next let us fix positive integers $k$ and $N$ and assume that $\mathcal{P}^{(k)}_\omega(n)$ is bounded. It follows
by Lemma 5.2 that $\omega$ is $(k,B)$-balanced for some positive integer $B$. We recall the following
lemma proved in [27]:

Lemma 5.10. [Lemma 5.4 in [27]] Let $k$ and $B$ be positive integers. There exist positive
integers $\alpha_x$ for each $x \in A^{\leq k}$ and a positive integer $M$ such that whenever

\[ \sum_{x \in A^{\leq k}} c_x \alpha_x \equiv 0 \pmod{M} \]

for integers $c_x$ with $|c_x| \leq B$ for each $x \in A^{\leq k}$, then $c_x = 0$ for each $x \in A^{\leq k}$.
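One way to see that weights as in Lemma 5.10 exist is to take powers of $2B + 1$: with $K = \mathrm{Card}\, A^{\leq k}$, put $\alpha_j = (2B+1)^j$ for $j = 0, \ldots, K-1$ and $M = (2B+1)^K$. Any sum $\sum_j c_j \alpha_j$ with $|c_j| \leq B$ has absolute value at most $(M - 1)/2 < M$, so it is divisible by $M$ only if it vanishes, and a balanced base-$(2B+1)$ expansion of $0$ has all digits equal to $0$. This particular choice is an assumption made here for illustration and need not be the construction used in [27]. The sketch below checks the property by brute force for small parameters.

```python
from itertools import product


def weights_and_modulus(K, B):
    """Illustrative choice: alpha_j = (2B+1)^j for j < K and M = (2B+1)^K."""
    base = 2 * B + 1
    return [base ** j for j in range(K)], base ** K


def only_trivial_solution(K, B):
    """Check that sum c_j * alpha_j = 0 (mod M) with |c_j| <= B forces c = 0."""
    alphas, M = weights_and_modulus(K, B)
    for cs in product(range(-B, B + 1), repeat=K):
        weighted = sum(c * a for c, a in zip(cs, alphas))
        if any(cs) and weighted % M == 0:
            return False
    return True


if __name__ == "__main__":
    for K, B in [(2, 1), (3, 2), (4, 1), (2, 3)]:
        print(K, B, only_trivial_solution(K, B))
```

The exhaustive loop visits $(2B+1)^K$ coefficient vectors, so it is meant only for small $K$ and $B$.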
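Set

\[ \mathcal{D} = (D - 1) \cap \{k, k + 1, k + 2, \ldots\}. \]

Then $\mathcal{D}$ is of positive upper density. We now define a finite coloring

\[ \Phi : \mathcal{D} \longrightarrow \{0, 1, 2, \ldots, M - 1\} \times \mathcal{F}_\omega(2k) \]

as follows

\[ \Phi(n) = \Bigl( \sum_{x \in A^{\leq k}} |\omega[1,n]|_x \,\alpha_x \pmod{M} \;;\; \omega[n - k + 1, n + k] \Bigr) \]

where $\alpha_x$ and $M$ are as in Lemma 5.10. Note that the second coordinate of $\Phi(n)$ is the suffix of
length $2k$ of $\omega[1, n + k]$. We note also that if $\Phi(m) = \Phi(n)$ for some $m < n$, then by considering
the first coordinate of $\Phi$ one has

\[ \sum_{x \in A^{\leq k}} |\omega[1,n]|_x \,\alpha_x - \sum_{x \in A^{\leq k}} |\omega[1,m]|_x \,\alpha_x \equiv 0 \pmod{M} \tag{11} \]

\[ \sum_{x \in A^{\leq k}} \bigl( |\omega[1,n]|_x - |\omega[1,m]|_x \bigr) \alpha_x \equiv 0 \pmod{M} \tag{12} \]

\[ \sum_{x \in A^{\leq k}} |\omega[m - |x| + 2, n]|_x \,\alpha_x \equiv 0 \pmod{M}. \tag{13} \]

Here (13) follows from (12) because the occurrences of $x$ in $\omega[1,n]$ that are not occurrences in
$\omega[1,m]$ are exactly those ending at a position in $\{m + 1, \ldots, n\}$, that is, the occurrences of $x$
in $\omega[m - |x| + 2, n]$.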

$\Phi$ defines a finite partition of $\mathcal{D}$ where two elements $r$ and $s$ in $\mathcal{D}$ belong to the same class
of the partition if and only if $\Phi(r) = \Phi(s)$. Clearly at least one class of this partition of $\mathcal{D}$
has positive upper density. Thus by Szemerédi's theorem, there exist positive integers $r$ and
$t$ with $r \geq k$ such that

\[ \{r, r + t, r + 2t, \ldots, r + Nt\} \subseteq \mathcal{D} \]

and

\[ \Phi(r) = \Phi(r + t) = \Phi(r + 2t) = \cdots = \Phi(r + Nt). \]

We now claim that the $N$ consecutive blocks of length $t$

\[ \omega[r + 1, r + t]\;\omega[r + t + 1, r + 2t]\;\omega[r + 2t + 1, r + 3t] \cdots \omega[r + (N - 1)t + 1, r + Nt] \]

are pairwise $k$-Abelian equivalent. This would prove that $\omega$ contains a $k$-Abelian $N$-power in
position $r + 1 \in D$.
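The coloring argument can be imitated on a finite prefix. The sketch below takes $D = \mathbb{N}$, computes $\Phi(n)$ with the illustrative weights $\alpha_x = (2B+1)^j$ and $M = (2B+1)^K$ (an assumed choice that satisfies the conclusion of Lemma 5.10, not necessarily the one in [27]), and searches for $r$ and $t$ with $\Phi(r) = \Phi(r+t) = \cdots = \Phi(r+Nt)$. The function names, the Fibonacci prefix and the search bound `max_step` are assumptions, and the empty word is left out of $A^{\leq k}$.

```python
from itertools import product


def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def count_occurrences(u, x):
    """Number of (possibly overlapping) occurrences of x in u."""
    return sum(1 for i in range(len(u) - len(x) + 1) if u.startswith(x, i))


def coloring(word, n, k, alphas, M):
    """Phi(n): weighted factor counts of w[1, n] modulo M, together with the
    window w[n-k+1, n+k]; word[0] plays the role of the letter a_0."""
    prefix = word[1:n + 1]
    first = sum(a * count_occurrences(prefix, x) for x, a in alphas.items()) % M
    return first, word[n - k + 1:n + k + 1]


def monochromatic_progression(word, k, B, N, max_step):
    """Find (r, t) with r >= k and Phi(r) = Phi(r+t) = ... = Phi(r+N*t)."""
    alphabet = sorted(set(word))
    patterns = [
        "".join(p) for m in range(1, k + 1) for p in product(alphabet, repeat=m)
    ]
    base = 2 * B + 1
    alphas = {x: base ** j for j, x in enumerate(patterns)}
    M = base ** len(patterns)
    limit = len(word) - k - 1  # largest n for which w[n+k] exists
    colors = {n: coloring(word, n, k, alphas, M) for n in range(k, limit + 1)}
    for t in range(1, max_step + 1):
        for r in range(k, limit - N * t + 1):
            if len({colors[r + j * t] for j in range(N + 1)}) == 1:
                return r, t
    return None


def blocks(word, r, t, N):
    """The N consecutive blocks w[r+jt+1, r+(j+1)t] for j = 0..N-1."""
    return [word[r + 1 + j * t:r + 1 + (j + 1) * t] for j in range(N)]


if __name__ == "__main__":
    w = fibonacci_word(80)
    found = monochromatic_progression(w, k=1, B=1, N=2, max_step=10)
    print(found, blocks(w, *found, 2) if found else None)
```

If $(r, t)$ is returned and the word is $(k,B)$-balanced, the argument in the text shows that the listed blocks are pairwise $k$-Abelian equivalent.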

To prove the claim, let $0 \leq i, j \leq N - 1$. We will show that

\[ \omega[r + it + 1, r + (i + 1)t] \sim_k \omega[r + jt + 1, r + (j + 1)t]. \]

By (13), first taking $n = r + (i + 1)t$ and $m = r + it$, then $n = r + (j + 1)t$ and $m = r + jt$,

\[ \sum_{x \in A^{\leq k}} |\omega[r + it - |x| + 2, r + (i + 1)t]|_x \,\alpha_x \equiv \sum_{x \in A^{\leq k}} |\omega[r + jt - |x| + 2, r + (j + 1)t]|_x \,\alpha_x \equiv 0 \pmod{M} \]

and hence

\[ \sum_{x \in A^{\leq k}} \bigl( |\omega[r + it - |x| + 2, r + (i + 1)t]|_x - |\omega[r + jt - |x| + 2, r + (j + 1)t]|_x \bigr) \alpha_x \equiv 0 \pmod{M}. \]

But since

\[ |\omega[r + it - |x| + 2, r + (i + 1)t]| = |\omega[r + jt - |x| + 2, r + (j + 1)t]| = |x| + t - 1 \]

and $\omega$ is $(k,B)$-balanced, it follows that

\[ \bigl|\, |\omega[r + it - |x| + 2, r + (i + 1)t]|_x - |\omega[r + jt - |x| + 2, r + (j + 1)t]|_x \,\bigr| \leq B \]

whence by Lemma 5.10 we deduce that for each $x \in A^{\leq k}$

\[ |\omega[r + it - |x| + 2, r + (i + 1)t]|_x = |\omega[r + jt - |x| + 2, r + (j + 1)t]|_x. \tag{14} \]

Since $\Phi(r + it) = \Phi(r + jt)$, the second coordinate of $\Phi$ gives

\[ \omega[r + it - k + 1, r + it + k] = \omega[r + jt - k + 1, r + jt + k]. \]

Together with (14) we deduce that for each $x \in A^{\leq k}$

\[ |\omega[r + it + 1, r + (i + 1)t]|_x = |\omega[r + jt + 1, r + (j + 1)t]|_x. \]

In other words

\[ \omega[r + it + 1, r + (i + 1)t] \sim_k \omega[r + jt + 1, r + (j + 1)t] \]

as required. This completes our proof of Theorem 5.4.

□